Safe Policy Search with Gaussian Process Models

Authors

  • Kyriakos Polymenakos
  • Alessandro Abate
  • Stephen Roberts
Abstract

We propose a method to optimise the parameters of a policy which will be used to safely perform a given task in a data-efficient manner. We train a Gaussian process model to capture the system dynamics, based on the PILCO framework. Our model has useful analytic properties, which allow closed form computation of error gradients and estimating the probability of violating given state space constraints. During training, as well as operation, only policies that are deemed safe are implemented on the real system, minimising the risk of failure.
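As a rough illustration of the safety check described in the abstract, the sketch below estimates the probability that a predicted trajectory stays inside a state constraint and accepts a policy only when that probability clears a threshold. It is not the authors' implementation. It assumes a one-dimensional state, Gaussian per-step predictions (means and standard deviations, such as a GP model might supply), an interval constraint, and independence between time steps. The function names and the 0.95 threshold are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a univariate normal distribution, evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def prob_in_interval(mean, std, lower, upper):
    """Probability that a Gaussian state lies inside [lower, upper]."""
    return normal_cdf(upper, mean, std) - normal_cdf(lower, mean, std)


def trajectory_safety(means, stds, lower, upper):
    """Probability that every predicted state stays within bounds.

    Assumption: time steps are treated as independent, so the per-step
    probabilities are simply multiplied together.
    """
    p = 1.0
    for m, s in zip(means, stds):
        p *= prob_in_interval(m, s, lower, upper)
    return p


def is_safe(means, stds, lower, upper, threshold=0.95):
    """Deem a policy safe if its predicted trajectory clears the threshold."""
    return trajectory_safety(means, stds, lower, upper) >= threshold


if __name__ == "__main__":
    # Hypothetical three-step prediction for a state constrained to [-1, 1].
    means = [0.0, 0.1, 0.2]
    stds = [0.05, 0.10, 0.15]
    print(trajectory_safety(means, stds, -1.0, 1.0))
    print(is_safe(means, stds, -1.0, 1.0))
```

In this sketch a candidate policy would be run on the real system only if `is_safe` returns `True` for its predicted trajectory, which mirrors the abstract's statement that only policies deemed safe are implemented.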

Similar resources

Probabilistic Differential Dynamic Programming

We present a data-driven, probabilistic trajectory optimization framework for systems with unknown dynamics, called Probabilistic Differential Dynamic Programming (PDDP). PDDP takes into account uncertainty explicitly for dynamics models using Gaussian processes (GPs). Based on the second-order local approximation of the value function, PDDP performs Dynamic Programming around a nominal traject...

Safe Exploration for Active Learning with Gaussian Processes

In this paper, the problem of safe exploration in the active learning context is considered. Safe exploration is especially important for data sampling from technical and industrial systems, e.g. combustion engines and gas turbines, where critical and unsafe measurements need to be avoided. The objective is to learn data-based regression models from such technical systems using a limited budget...

Variational Bayesian Optimization for Runtime Risk-Sensitive Control

We present a new Bayesian policy search algorithm suitable for problems with policy-dependent cost variance, a property present in many robot control tasks. We extend recent work on variational heteroscedastic Gaussian processes to the optimization case to achieve efficient minimization of very noisy cost signals. In contrast to most policy search algorithms, our method explicitly models the co...

Governance: Blending Bureaucratic Rules with Day to Day Operational Realities; Comment on “Governance, Government, and the Search for New Provider Models”

Richard Saltman and Antonio Duran take up the challenging issue of governance in their article “Governance, Government and the Search for New Provider Models,” and use two case studies of health policy changes in Sweden and Spain to shed light on the subject. In this commentary, I seek to link their conceptualization of governance, especially its interrelated roles at the macro, meso, and micro...

Grid-search event location with non-Gaussian error models

This study employs an event location algorithm based on grid search to investigate the possibility of impr...

Journal:
  • CoRR

Volume: abs/1712.05556

Publication date: 2017